Draft

Dynamic Pricing & Demand Analysis Suite

data-viz
XGBoost
analytics
notebook
python
Author

Atila Madai

Published

June 16, 2025

Dynamic Hotel Pricing Optimization

Project Goal

This project explores how to build a real-time dynamic pricing engine for hotel inventory, inspired by the way trading floors reprice continuously.

The system leverages:

- Advanced demand forecasting models
- Price elasticity modeling under competition
- Revenue optimization via mathematical and ML-based methods
- Real-time simulation and visualization for business impact assessment


Background

The global hotel market exceeds $1.3 trillion, with increasing adoption of dynamic pricing by major hotel chains and OTAs.
Building a flexible, API-first, ML-driven pricing engine represents a major opportunity both for industry players and as a standalone service offering.

Inspired by companies building real-time, data-driven marketplaces, this project demonstrates how such a system can be designed and implemented for the travel and hospitality sector.


Project Structure

DynamicHotelPricingOptimization/
├── data/                      # Raw and processed data
├── notebooks/                 # Jupyter notebooks with modeling and results
├── src/                       # Python scripts for core functionality
├── reports/                   # Generated visualizations and slides
└── tests/                     # (Optional) unit tests

Getting Started

Install dependencies:

    pip install -r requirements.txt

Next Steps

✅ Initial EDA and feature engineering
✅ Demand forecasting
✅ Price elasticity modeling
✅ Revenue optimization loop
✅ Simulation and dashboarding


This article is in development.


This post combines four related notebooks exploring dynamic pricing and demand analytics:

  1. EDA: Dynamic Hotel Pricing Optimization
  2. Demand Forecasting
  3. Price Elasticity Modeling
  4. Price Elasticity Modeling (XGBoost)

1. EDA: Dynamic Hotel Pricing Optimization

🏨 Dynamic Hotel Pricing Optimization

📊 01_EDA - Exploratory Data Analysis

Goal: Explore the dataset, understand key patterns and prepare features for modeling.

| | hotel | is_canceled | lead_time | arrival_date_year | arrival_date_month | arrival_date_week_number | arrival_date_day_of_month | stays_in_weekend_nights | stays_in_week_nights | adults | ... | deposit_type | agent | company | days_in_waiting_list | customer_type | adr | required_car_parking_spaces | total_of_special_requests | reservation_status | reservation_status_date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Resort Hotel | 0 | 342 | 2015 | July | 27 | 1 | 0 | 0 | 2 | ... | No Deposit | NaN | NaN | 0 | Transient | 0.0 | 0 | 0 | Check-Out | 2015-07-01 |
| 1 | Resort Hotel | 0 | 737 | 2015 | July | 27 | 1 | 0 | 0 | 2 | ... | No Deposit | NaN | NaN | 0 | Transient | 0.0 | 0 | 0 | Check-Out | 2015-07-01 |
| 2 | Resort Hotel | 0 | 7 | 2015 | July | 27 | 1 | 0 | 1 | 1 | ... | No Deposit | NaN | NaN | 0 | Transient | 75.0 | 0 | 0 | Check-Out | 2015-07-02 |
| 3 | Resort Hotel | 0 | 13 | 2015 | July | 27 | 1 | 0 | 1 | 1 | ... | No Deposit | 304.0 | NaN | 0 | Transient | 75.0 | 0 | 0 | Check-Out | 2015-07-02 |
| 4 | Resort Hotel | 0 | 14 | 2015 | July | 27 | 1 | 0 | 2 | 2 | ... | No Deposit | 240.0 | NaN | 0 | Transient | 98.0 | 0 | 1 | Check-Out | 2015-07-03 |

5 rows × 32 columns

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 119390 entries, 0 to 119389
Data columns (total 32 columns):
 #   Column                          Non-Null Count   Dtype  
---  ------                          --------------   -----  
 0   hotel                           119390 non-null  object 
 1   is_canceled                     119390 non-null  int64  
 2   lead_time                       119390 non-null  int64  
 3   arrival_date_year               119390 non-null  int64  
 4   arrival_date_month              119390 non-null  object 
 5   arrival_date_week_number        119390 non-null  int64  
 6   arrival_date_day_of_month       119390 non-null  int64  
 7   stays_in_weekend_nights         119390 non-null  int64  
 8   stays_in_week_nights            119390 non-null  int64  
 9   adults                          119390 non-null  int64  
 10  children                        119386 non-null  float64
 11  babies                          119390 non-null  int64  
 12  meal                            119390 non-null  object 
 13  country                         118902 non-null  object 
 14  market_segment                  119390 non-null  object 
 15  distribution_channel            119390 non-null  object 
 16  is_repeated_guest               119390 non-null  int64  
 17  previous_cancellations          119390 non-null  int64  
 18  previous_bookings_not_canceled  119390 non-null  int64  
 19  reserved_room_type              119390 non-null  object 
 20  assigned_room_type              119390 non-null  object 
 21  booking_changes                 119390 non-null  int64  
 22  deposit_type                    119390 non-null  object 
 23  agent                           103050 non-null  float64
 24  company                         6797 non-null    float64
 25  days_in_waiting_list            119390 non-null  int64  
 26  customer_type                   119390 non-null  object 
 27  adr                             119390 non-null  float64
 28  required_car_parking_spaces     119390 non-null  int64  
 29  total_of_special_requests       119390 non-null  int64  
 30  reservation_status              119390 non-null  object 
 31  reservation_status_date         119390 non-null  object 
dtypes: float64(4), int64(16), object(12)
memory usage: 29.1+ MB

| | is_canceled | lead_time | arrival_date_year | arrival_date_week_number | arrival_date_day_of_month | stays_in_weekend_nights | stays_in_week_nights | adults | children | babies | is_repeated_guest | previous_cancellations | previous_bookings_not_canceled | booking_changes | agent | company | days_in_waiting_list | adr | required_car_parking_spaces | total_of_special_requests |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| count | 119390.000000 | 119390.000000 | 119390.000000 | 119390.000000 | 119390.000000 | 119390.000000 | 119390.000000 | 119390.000000 | 119386.000000 | 119390.000000 | 119390.000000 | 119390.000000 | 119390.000000 | 119390.000000 | 103050.000000 | 6797.000000 | 119390.000000 | 119390.000000 | 119390.000000 | 119390.000000 |
| mean | 0.370416 | 104.011416 | 2016.156554 | 27.165173 | 15.798241 | 0.927599 | 2.500302 | 1.856403 | 0.103890 | 0.007949 | 0.031912 | 0.087118 | 0.137097 | 0.221124 | 86.693382 | 189.266735 | 2.321149 | 101.831122 | 0.062518 | 0.571363 |
| std | 0.482918 | 106.863097 | 0.707476 | 13.605138 | 8.780829 | 0.998613 | 1.908286 | 0.579261 | 0.398561 | 0.097436 | 0.175767 | 0.844336 | 1.497437 | 0.652306 | 110.774548 | 131.655015 | 17.594721 | 50.535790 | 0.245291 | 0.792798 |
| min | 0.000000 | 0.000000 | 2015.000000 | 1.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 6.000000 | 0.000000 | -6.380000 | 0.000000 | 0.000000 |
| 25% | 0.000000 | 18.000000 | 2016.000000 | 16.000000 | 8.000000 | 0.000000 | 1.000000 | 2.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 9.000000 | 62.000000 | 0.000000 | 69.290000 | 0.000000 | 0.000000 |
| 50% | 0.000000 | 69.000000 | 2016.000000 | 28.000000 | 16.000000 | 1.000000 | 2.000000 | 2.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 14.000000 | 179.000000 | 0.000000 | 94.575000 | 0.000000 | 0.000000 |
| 75% | 1.000000 | 160.000000 | 2017.000000 | 38.000000 | 23.000000 | 2.000000 | 3.000000 | 2.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 229.000000 | 270.000000 | 0.000000 | 126.000000 | 0.000000 | 1.000000 |
| max | 1.000000 | 737.000000 | 2017.000000 | 53.000000 | 31.000000 | 19.000000 | 50.000000 | 55.000000 | 10.000000 | 10.000000 | 1.000000 | 26.000000 | 72.000000 | 21.000000 | 535.000000 | 543.000000 | 391.000000 | 5400.000000 | 8.000000 | 5.000000 |

| | stay_length | lead_time | booking_month | weekday |
| --- | --- | --- | --- | --- |
| 0 | 0 | 342 | 7 | 2 |
| 1 | 0 | 737 | 7 | 2 |
| 2 | 1 | 7 | 7 | 3 |
| 3 | 1 | 13 | 7 | 3 |
| 4 | 2 | 14 | 7 | 4 |

Cancellation Rate: 37.04%
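
The notebook code behind the engineered columns is not shown, so the following is only a sketch of one way to derive them. It assumes `stay_length` is the sum of the two night counts and that `booking_month` and `weekday` come from `reservation_status_date` (Monday = 0), which matches the five preview rows above. The function name `add_features` is hypothetical.

```python
import pandas as pd

# Five rows mirroring the preview above, restricted to the columns used here.
df = pd.DataFrame({
    "is_canceled": [0, 0, 0, 0, 0],
    "lead_time": [342, 737, 7, 13, 14],
    "stays_in_weekend_nights": [0, 0, 0, 0, 0],
    "stays_in_week_nights": [0, 0, 1, 1, 2],
    "reservation_status_date": [
        "2015-07-01", "2015-07-01", "2015-07-02", "2015-07-02", "2015-07-03",
    ],
})


def add_features(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with stay_length, booking_month and weekday added (assumed definitions)."""
    out = frame.copy()
    status_date = pd.to_datetime(out["reservation_status_date"])
    # Assumption: total stay length is weekend nights plus week nights.
    out["stay_length"] = out["stays_in_weekend_nights"] + out["stays_in_week_nights"]
    # Assumption: month and weekday are taken from the reservation status date.
    out["booking_month"] = status_date.dt.month
    out["weekday"] = status_date.dt.dayofweek  # Monday = 0
    return out


features = add_features(df)[["stay_length", "lead_time", "booking_month", "weekday"]]

# The cancellation rate is the mean of the 0/1 is_canceled flag.
cancellation_rate = df["is_canceled"].mean()

print(features)
print(f"Cancellation Rate: {cancellation_rate:.2%}")
```

On the full dataset the same `mean()` call corresponds to the 0.370416 reported for `is_canceled` in the summary statistics, which is where the 37.04% figure comes from.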

Source: 🏨 Dynamic Hotel Pricing Optimization

2. Demand Forecasting

📈 02_Demand Forecasting

Goal: Build models to forecast demand (number of bookings) per hotel and date.

| | reservation_status_date | bookings |
| --- | --- | --- |
| 0 | 2014-10-17 | 180 |
| 1 | 2014-11-18 | 1 |
| 2 | 2015-01-01 | 763 |
| 3 | 2015-01-02 | 16 |
| 4 | 2015-01-18 | 1 |
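
A table like the one above can be produced by counting rows per `reservation_status_date`. The sketch below uses a handful of made-up rows rather than the real dataset, fills missing calendar days with zero, and adds a naive moving-average baseline. The baseline and the helper name `moving_average_forecast` are illustrative assumptions, not the forecasting model used in the notebook.

```python
import pandas as pd

# Hypothetical sample rows; the real notebook would use the full bookings table.
df = pd.DataFrame({
    "reservation_status_date": pd.to_datetime(
        ["2015-01-01", "2015-01-01", "2015-01-02", "2015-01-04"]
    ),
})

# Count bookings per date (add "hotel" to the groupby keys for per-hotel demand).
daily = (
    df.groupby("reservation_status_date")
      .size()
      .rename("bookings")
      .reset_index()
)

# Re-index to a continuous daily calendar so days without bookings count as zero.
series = (
    daily.set_index("reservation_status_date")["bookings"]
         .asfreq("D", fill_value=0)
)


def moving_average_forecast(history: pd.Series, window: int = 3) -> float:
    """Naive baseline: predict the next day as the mean of the last `window` days."""
    return float(history.tail(window).mean())


next_day = moving_average_forecast(series, window=3)
print(daily)
print(f"Baseline forecast for the next day: {next_day:.2f}")
```

A baseline of this kind is mainly useful as a reference point: a more elaborate forecasting model should beat it on held-out dates before it is trusted for pricing decisions.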

3. Price Elasticity Modeling

📉 03_Price Elasticity Modeling

Goal: Model how booking demand responds to changes in price, and estimate optimal pricing zones.

| | hotel | is_canceled | lead_time | arrival_date_year | arrival_date_month | arrival_date_week_number | arrival_date_day_of_month | stays_in_weekend_nights | stays_in_week_nights | adults | ... | deposit_type | agent | company | days_in_waiting_list | customer_type | adr | required_car_parking_spaces | total_of_special_requests | reservation_status | reservation_status_date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Resort Hotel | 0 | 342 | 2015 | July | 27 | 1 | 0 | 0 | 2 | ... | No Deposit | NaN | NaN | 0 | Transient | 0.0 | 0 | 0 | Check-Out | 2015-07-01 |
| 1 | Resort Hotel | 0 | 737 | 2015 | July | 27 | 1 | 0 | 0 | 2 | ... | No Deposit | NaN | NaN | 0 | Transient | 0.0 | 0 | 0 | Check-Out | 2015-07-01 |
| 2 | Resort Hotel | 0 | 7 | 2015 | July | 27 | 1 | 0 | 1 | 1 | ... | No Deposit | NaN | NaN | 0 | Transient | 75.0 | 0 | 0 | Check-Out | 2015-07-02 |
| 3 | Resort Hotel | 0 | 13 | 2015 | July | 27 | 1 | 0 | 1 | 1 | ... | No Deposit | 304.0 | NaN | 0 | Transient | 75.0 | 0 | 0 | Check-Out | 2015-07-02 |
| 4 | Resort Hotel | 0 | 14 | 2015 | July | 27 | 1 | 0 | 2 | 2 | ... | No Deposit | 240.0 | NaN | 0 | Transient | 98.0 | 0 | 1 | Check-Out | 2015-07-03 |

5 rows × 32 columns

| | price_bucket | bookings |
| --- | --- | --- |
| 0 | (0, 50] | 7706 |
| 1 | (50, 100] | 33972 |
| 2 | (100, 150] | 21520 |
| 3 | (150, 200] | 7184 |
| 4 | (200, 300] | 2848 |
| 5 | (300, 500] | 187 |
| 6 | (500, 1000] | 2 |
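
The bucket edges above are the kind of right-closed intervals that `pd.cut` produces. As a sketch, the first part below buckets a few made-up ADR values with the same edges. The second part takes the booking counts from the table and multiplies them by each bucket's midpoint to get a rough revenue proxy. Using the midpoint as the representative price, and ignoring stay length and cancellations, are simplifying assumptions made here for illustration.

```python
import pandas as pd

# Bucket edges read off the table above.
bins = [0, 50, 100, 150, 200, 300, 500, 1000]

# Part 1: bucketing hypothetical ADR values into right-closed intervals.
adr = pd.Series([45.0, 75.0, 98.0, 120.0, 180.0, 250.0, 75.0])
buckets = pd.cut(adr, bins=bins)
counts = adr.groupby(buckets, observed=False).size()  # keeps empty buckets
print(counts)

# Part 2: revenue proxy per bucket, using the booking counts reported above.
intervals = pd.IntervalIndex.from_breaks(bins)  # right-closed by default
summary = pd.DataFrame(
    {
        "price_mid": list(intervals.mid),
        "bookings": [7706, 33972, 21520, 7184, 2848, 187, 2],
    },
    index=intervals,
)
# Assumption: midpoint price times bookings approximates revenue per bucket.
summary["revenue_proxy"] = summary["price_mid"] * summary["bookings"]

best_bucket = summary["revenue_proxy"].idxmax()
print(summary)
print(f"Bucket with the highest revenue proxy: {best_bucket}")
```

Under these assumptions the (100, 150] bucket comes out on top (125 × 21520 = 2,690,000), narrowly ahead of (50, 100] (75 × 33972 = 2,547,900), which gives a first, crude reading of where an "optimal pricing zone" might sit.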

                            OLS Regression Results                            
==============================================================================
Dep. Variable:               bookings   R-squared:                       0.598
Model:                            OLS   Adj. R-squared:                  0.518
Method:                 Least Squares   F-statistic:                     7.441
Date:                Sat, 05 Jul 2025   Prob (F-statistic):             0.0414
Time:                        20:15:04   Log-Likelihood:                -14.821
No. Observations:                   7   AIC:                             33.64
Df Residuals:                       5   BIC:                             33.53
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         19.5084      4.513      4.323      0.008       7.909      31.108
price_mid     -2.3678      0.868     -2.728      0.041      -4.599      -0.137
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   1.033
Prob(Omnibus):                    nan   Jarque-Bera (JB):                1.053
Skew:                          -0.709   Prob(JB):                        0.591
Kurtosis:                       1.734   Cond. No.                         27.0
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
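
The reported coefficients are consistent with a log-log specification: regressing the natural log of bookings on the natural log of each bucket's midpoint price. The notebook code is not shown, so the midpoints (25, 75, ..., 750) are an assumption. The sketch below refits the line with plain NumPy instead of statsmodels and should reproduce the slope, intercept, and R-squared above to rounding.

```python
import numpy as np

# Bucket midpoints (assumed) and booking counts from the price bucket table.
price_mid = np.array([25, 75, 125, 175, 250, 400, 750], dtype=float)
bookings = np.array([7706, 33972, 21520, 7184, 2848, 187, 2], dtype=float)

# Log-log transform: the slope of log(bookings) on log(price) is the elasticity.
x = np.log(price_mid)
y = np.log(bookings)

# Ordinary least squares fit of y = intercept + slope * x.
slope, intercept = np.polyfit(x, y, deg=1)

# Coefficient of determination for the fitted line.
y_hat = intercept + slope * x
r_squared = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print(f"elasticity (slope): {slope:.4f}")
print(f"intercept:          {intercept:.4f}")
print(f"R-squared:          {r_squared:.3f}")
```

Read at face value, a slope of about -2.37 means a 1% increase in price goes with roughly a 2.37% drop in bookings. With only seven bucketed observations and a wide confidence interval (-4.599 to -0.137), the estimate is best treated as a rough indication rather than a precise elasticity.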

4. Price Elasticity Modeling (XGBoost)

🚀 Advanced Elasticity Modeling → XGBoost Regressor

Goal: Model complex, non-linear price elasticity using XGBoost.

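| | adr | market_segment_encoded | hotel_encoded | day_of_week | month | bookings |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | -6.38 | 4 | 1 | 2 | 3 | 1 |
| 1 | 0.00 | 0 | 0 | 1 | 11 | 1 |
| 2 | 0.00 | 0 | 0 | 2 | 6 | 3 |
| 3 | 0.00 | 0 | 0 | 4 | 7 | 1 |
| 4 | 0.00 | 1 | 0 | 0 | 1 | 1 |

RMSE: 2.99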
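
The sketch below shows one way the feature table and the RMSE metric could be produced. It uses made-up rows, encodes the categorical columns as integer codes, counts bookings per feature combination, and scores a constant-mean placeholder prediction. The model itself is deliberately left out: any regressor with `fit` and `predict`, such as `xgboost.XGBRegressor`, would replace the placeholder. The encoding scheme and the `date` column are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical raw bookings; the real notebook would use the full dataset.
raw = pd.DataFrame({
    "adr": [75.0, 75.0, 98.0, 120.0, 75.0],
    "market_segment": ["Direct", "Direct", "Online TA", "Corporate", "Direct"],
    "hotel": ["Resort Hotel", "Resort Hotel", "City Hotel", "City Hotel", "Resort Hotel"],
    "date": pd.to_datetime(
        ["2015-07-01", "2015-07-01", "2015-07-02", "2015-07-03", "2015-07-08"]
    ),
})

# Assumption: categories are label-encoded as integer codes (alphabetical order).
raw["market_segment_encoded"] = raw["market_segment"].astype("category").cat.codes
raw["hotel_encoded"] = raw["hotel"].astype("category").cat.codes
raw["day_of_week"] = raw["date"].dt.dayofweek  # Monday = 0
raw["month"] = raw["date"].dt.month

feature_cols = ["adr", "market_segment_encoded", "hotel_encoded", "day_of_week", "month"]

# Target: number of bookings observed for each feature combination.
table = raw.groupby(feature_cols).size().rename("bookings").reset_index()


def rmse(y_true, y_pred) -> float:
    """Root mean squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Placeholder prediction: the mean booking count for every row.
baseline_pred = np.full(len(table), table["bookings"].mean())
baseline_rmse = rmse(table["bookings"], baseline_pred)

print(table)
print(f"Baseline RMSE: {baseline_rmse:.2f}")
```

Comparing a fitted model's RMSE against a constant-mean baseline like this one helps put a figure such as 2.99 in context, since RMSE is expressed in the same units as the booking counts.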